    From sequence to structure, to function, and back again: Integrating knowledge-based approaches with physical intuitions for protein folding, binding, and design

    Poster abstract. Most biological activities are directed and/or regulated by proteins made of a gene-specified sequence of 20 amino-acid residue types. As a result, function or malfunction of specific proteins is responsible for almost all diseases. Proteins perform their function through their unique, self-assembled (folded) three-dimensional structures and through their specific binding to small molecules, to DNA/RNA (e.g. transcription factors that regulate gene expression), or to other proteins (e.g. molecular recognition in signal transduction). Thus, how to predict the structure of a protein from its amino-acid sequence, discover the function from its structure and then design the sequence from its function or structure are the most essential problems in structural biology. In this poster, we illustrate how coupling physical intuitions with learning from structural databases can go a long way toward untangling the complex relation between the sequence, structure and function of proteins.

    Folding thermodynamics of model four-strand antiparallel beta-sheet proteins

    The thermodynamic properties of three different types of off-lattice four-strand beta-sheet protein models interacting via a hybrid Go-type potential have been investigated. Discontinuous molecular dynamics simulations have been performed for different sizes of the bias gap g, an artificial measure of a model protein's preference for its native state. The thermodynamic transition temperatures are obtained by calculating the squared radius of gyration, the root-mean-squared pair separation fluctuation, the specific heat, the internal energy of the system, and the Lindemann disorder parameter. In spite of their simplicity, the protein-like heteropolymers show a complex set of protein transitions, as observed in experimental studies. Starting from high temperature, these transitions include a collapse transition, a disordered-to-ordered globule transition, a folding transition, and a liquid-to-solid transition. These transitions depend strongly on the native-state geometry of the model proteins and the size of the bias gap. A strong transition from the disordered globule state to the ordered globule state with a large energy change and a weak transition from the ordered globule state to the native state with a small energy change were observed for the large-gap models. For the small-gap models, no native structures were observed at any temperature; all three beta-sheet proteins fold into a partially ordered globule state that is geometrically different from the native state. For small bias gaps at even lower temperatures, all protein motions are frozen, indicating an inactive solid-like phase.
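    One of the quantities listed above, the specific heat, is commonly estimated from energy fluctuations at fixed temperature, and a transition temperature is then read off as the location of a peak. The sketch below illustrates that general idea only. The fluctuation formula is the standard canonical-ensemble relation, not a procedure confirmed by the abstract, and the function names, reduced units (k_B = 1) and input layout are assumptions made for illustration.

```python
from statistics import fmean


def specific_heat(energies, temperature, k_b=1.0):
    """Estimate the heat capacity from energy samples at one temperature.

    Uses the canonical fluctuation relation C = (<E^2> - <E>^2) / (k_B T^2).
    Reduced units (k_b = 1) are an assumption of this sketch.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    energies = list(energies)
    mean_e = fmean(energies)
    mean_e2 = fmean([e * e for e in energies])
    return (mean_e2 - mean_e ** 2) / (k_b * temperature ** 2)


def peak_temperature(samples_by_temperature, k_b=1.0):
    """Return the sampled temperature at which the heat capacity is largest.

    samples_by_temperature maps a temperature to a list of sampled energies
    (a hypothetical layout chosen for this example).
    """
    return max(
        samples_by_temperature,
        key=lambda t: specific_heat(samples_by_temperature[t], t, k_b),
    )
```

    With several transitions, as described in the abstract, the heat-capacity curve would show more than one peak, so a single maximum as returned here would identify only the strongest one.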

    Accurate single-sequence prediction of solvent accessible surface area using local and global features

    We present a new approach for predicting the Accessible Surface Area (ASA) using a General Neural Network (GENN). The novelty of the approach lies in not using residue mutation profiles generated by multiple sequence alignments as descriptive inputs. Instead, we use solely sequential window information and global features such as single-residue and two-residue compositions of the chain. The resulting predictor is both considerably more efficient than sequence alignment-based predictors and of comparable accuracy to them. Introduction of the global inputs significantly helps achieve this comparable accuracy. The predictor, termed ASAquick, is tested on predicting the ASA of globular proteins and found to perform similarly well for so-called easy and hard cases, indicating generalizability and possible usability for de novo protein structure prediction. The source code and Linux executables for GENN and ASAquick are available from Research and Information Systems at http://mamiris.com, from the SPARKS Lab at http://sparks-lab.org, and from the Battelle Center for Mathematical Medicine at http://mathmed.org.
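    To make the idea of combining local window information with global composition concrete, the sketch below builds a feature vector for one residue from a one-hot encoded sequence window plus the single-residue composition of the whole chain. It is not the ASAquick encoding: the window size, the one-hot scheme, the zero padding at chain ends and all function names are assumptions, and the two-residue compositions mentioned in the abstract are left out for brevity.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Global feature: fraction of each of the 20 residue types in the chain."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for residue in seq:
        if residue in counts:
            counts[residue] += 1
    total = sum(counts.values()) or 1  # avoid division by zero
    return [counts[aa] / total for aa in AMINO_ACIDS]


def window_one_hot(seq, center, half_width=2):
    """Local feature: one-hot rows for positions center-half_width..center+half_width.

    Positions outside the chain (or unknown residues) are encoded as all zeros.
    """
    features = []
    for pos in range(center - half_width, center + half_width + 1):
        row = [0.0] * len(AMINO_ACIDS)
        if 0 <= pos < len(seq) and seq[pos] in AMINO_ACIDS:
            row[AMINO_ACIDS.index(seq[pos])] = 1.0
        features.extend(row)
    return features


def residue_features(seq, center, half_width=2):
    """Concatenate the local window encoding with the global composition."""
    return window_one_hot(seq, center, half_width) + composition(seq)
```

    The global part is identical for every residue of a chain, which is what lets a per-residue predictor see chain-level context without any alignment step.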

    LEAP: highly accurate prediction of protein loop conformations by integrating coarse-grained sampling and optimized energy scores with all-atom refinement of backbone and side chains

    Prediction of protein loop conformations without any prior knowledge (ab initio prediction) is an unsolved problem. Its solution will significantly impact protein homology and template-based modeling as well as ab initio protein-structure prediction. Here, we developed a coarse-grained, optimized scoring function for initial sampling and ranking of loop decoys. The resulting decoys are then further optimized in backbone and side-chain conformations and ranked by all-atom energy scoring functions. The final integrated technique, called loop prediction by energy-assisted protocol (LEAP), achieved a median value of 2.1 Å root mean square deviation (RMSD) for 325 12-residue test loops and 2.0 Å RMSD for 45 12-residue loops from critical assessment of structure-prediction techniques (CASP) 10 target proteins with native core structures (backbone and side chains). If all side-chain conformations in protein cores were predicted in the absence of the target loop, loop-prediction accuracy is reduced only slightly (0.2 Å difference in RMSD for 12-residue loops in the CASP target proteins). The accuracy obtained is an improvement of about 1 Å RMSD or more over other methods we tested. The executable file for a Linux system is freely available for academic users at http://sparks-lab.org.
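    The accuracy figures above are reported as RMSD between predicted and native loop coordinates. A minimal sketch of that measure follows. It assumes both coordinate sets are already expressed in the same frame (for example, because the rest of the protein is held fixed) and list the same atoms in the same order; no superposition is performed, and the abstract does not state which atoms or alignment the authors used.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z).

    No superposition is applied: both sets are assumed to share one frame.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```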

    Web-based toolkits for topology prediction of transmembrane helical proteins, fold recognition, structure and binding scoring, folding-kinetics analysis and comparative analysis of domain combinations

    We have developed the following web servers for protein structural modeling and analysis at http://theory.med.buffalo.edu: THUMBUP, UMDHMMTMHP and TUPS, predictors of transmembrane helical protein topology based on a mean-burial-propensity scale of amino acid residues (THUMBUP), a hidden Markov model (UMDHMMTMHP) and their combination (TUPS); SPARKS 2.0 and SP3, two profile–profile alignment methods that match input query sequence(s) to structural templates by integrating the sequence profile with a knowledge-based structural score (SPARKS 2.0) and a structure-derived profile (SP3); DFIRE, a knowledge-based potential for scoring the free energy of monomers (DMONOMER), loop conformations (DLOOP), mutant stability (DMUTANT) and binding affinity of protein–protein/peptide/DNA complexes (DCOMPLEX & DDNA); TCD, a program for protein-folding rate and transition-state analysis of small globular proteins; and DOGMA, a web server that allows comparative analysis of domain combinations between plants and 55 other organisms. These servers provide tools for prediction and/or analysis of proteins at the secondary structure, tertiary structure and interaction levels, respectively.

    SP5: Improving Protein Fold Recognition by Using Torsion Angle Profiles and Profile-Based Gap Penalty Model

    How to recognize the structural fold of a protein is one of the challenges in protein structure prediction. We have developed a series of single (non-consensus) methods (SPARKS, SP2, SP3, SP4) that are based on weighted matching of two to four sequence and structure-based profiles. There is a robust improvement in the accuracy and sensitivity of fold recognition as the number of matching profiles increases. Here, we introduce a new profile-profile comparison term based on real-value dihedral torsion angles. Together with an updated real-value solvent accessibility profile and a new variable gap-penalty model based on a fractional power of insertion/deletion profiles, the new method (SP5) leads to a robust improvement over the previous SP methods. There is a 2% absolute increase (5% relative improvement) in alignment accuracy over SP4 based on two independent benchmarks. Moreover, SP5 achieves a 7% absolute increase (22% relative improvement) in the success rate of recognizing correct structural folds, and a 32% relative improvement in the accuracy of models within the same fold in the Lindahl benchmark. In addition, the modeling accuracy of top-1 ranked models is improved by 12% over SP4 for the difficult targets in the CASP 7 test set. These results highlight the importance of harnessing predicted structural properties in challenging remote-homolog recognition. The SP5 server is available at http://sparks.informatics.iupui.edu.

    Folding rate prediction using total contact distance

    © 2002 by the Biophysical Society. Linear regression analysis found that either the contact order (CO) or the long-range order (LRO) parameter has a significant correlation with the logarithms of folding rates. This suggests that sequence separation per contact and total number of contacts are both important in determining the rate of folding. Here, the two factors are incorporated into a new parameter, total contact distance (TCD). Using a database of 28 two-state or weakly three-state folding proteins, TCD is found to be the most accurate among the three parameters (CO, LRO, and TCD) in terms of correlation and prediction. It provides even more accurate prediction than the best neural network results with two descriptors (contact order and stability per residue). The improvement is achieved in all three structural classes (all α, β, and mixed). The accuracy of total contact distance in predicting folding rates is essentially unchanged if “short”-ranged contacts (|i − j| ≤ 14) are not included in the calculation. Thus, only long-range contacts with a sequence separation of more than 14 residues are important in determining the rate of folding. This is consistent with the results from the long-range order parameter. One of the significant outliers in prediction is found to be associated with the only protein in the database that involves nonlocal disulfide bonds. Removing this protein leads to a correlation coefficient of 0.89 between experimentally observed and predicted folding rates in jackknife cross validation. The corresponding values for CO and LRO are 0.71 and 0.80, respectively.
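    The abstract describes TCD as combining sequence separation per contact with the total number of contacts, but does not give a formula. The sketch below shows one way such a combination can be written, assuming CO is the mean sequence separation per contact divided by chain length and TCD is the summed separation divided by the square of the chain length, so that TCD = CO × (number of contacts / chain length). These normalizations, the contact list format and the function name are assumptions of this example, not definitions taken from the text.

```python
def contact_metrics(contacts, chain_length, min_separation=0):
    """Compute contact-based descriptors from a list of residue index pairs.

    contacts: iterable of (i, j) residue indices in contact.
    min_separation: contacts with |i - j| <= min_separation are ignored,
    e.g. 14 to keep only long-range contacts as discussed in the abstract.
    The normalizations used here are assumptions, not the published formulas.
    """
    if chain_length <= 0:
        raise ValueError("chain_length must be positive")
    separations = [abs(i - j) for i, j in contacts if abs(i - j) > min_separation]
    total = sum(separations)
    n_contacts = len(separations)
    # Assumed CO: average sequence separation per contact, scaled by chain length.
    co = total / (n_contacts * chain_length) if n_contacts else 0.0
    # Assumed TCD: summed sequence separation scaled by chain length squared.
    tcd = total / chain_length ** 2
    return {"n_contacts": n_contacts, "CO": co, "TCD": tcd}
```

    Calling `contact_metrics(contacts, chain_length, min_separation=14)` restricts the sum to contacts separated by more than 14 residues, which mirrors the abstract's observation that dropping short-ranged contacts leaves the prediction accuracy essentially unchanged.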

    Protein Folding Pathways and Kinetics: Molecular Dynamics Simulations of β-Strand Motifs

    The folding pathways and the kinetic properties of three different types of off-lattice four-strand antiparallel β-strand protein models interacting via a hybrid Go-type potential have been investigated using discontinuous molecular dynamics simulations. The kinetic study of protein folding was conducted by temperature quenching from a denatured or random coil state to a native state. The progress parameters used in the kinetic study include the squared radius of gyration Rg², the fraction of native contacts within the protein as a whole, Q, and between specific strands, Qab. In the time series of folding, the denatured proteins undergo a conformational change toward the native state. The model proteins exhibit a variety of kinetic folding pathways that include a fast-track folding pathway without passing through an intermediate and multiple pathways with trapping into more than one intermediate. The kinetic folding behavior of the β-strand proteins depends strongly on the native-state geometry of the model proteins and the size of the bias gap g, an artificial measure of a model protein's preference for its native state.
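    The two progress parameters named above can be computed from a single conformation as sketched below. The example assumes equal-mass beads, a hypothetical list of native contact pairs, and a simple distance cutoff for deciding whether a native contact is formed. None of these details is given in the abstract, and the function names are illustrative.

```python
import math


def squared_radius_of_gyration(coords):
    """Mean squared distance of beads from their centroid (equal masses assumed)."""
    n = len(coords)
    if n == 0:
        raise ValueError("coords must not be empty")
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    return sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords) / n


def fraction_native_contacts(coords, native_pairs, cutoff):
    """Fraction of native contact pairs (i, j) currently within the cutoff distance.

    Passing only the pairs that link two given strands yields a strand-specific
    value analogous to Qab. The cutoff criterion is an assumption of this sketch.
    """
    if not native_pairs:
        raise ValueError("native_pairs must not be empty")
    formed = sum(
        1 for i, j in native_pairs if math.dist(coords[i], coords[j]) <= cutoff
    )
    return formed / len(native_pairs)
```

    Evaluating both functions on each saved frame of a quench trajectory gives the kind of time series in which collapse (falling Rg²) and native-structure formation (rising Q) can be followed separately.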